Method of, and system for, processing email

ABSTRACT

A system for processing emails incorporates means for dealing with previously unknown viruses. The system monitors email traffic patterns to identify patterns characteristic of a virus outbreak and takes corrective action when an outbreak is detected. Individual emails are analysed and, if any one of the constituent parts contains content which could possibly carry a virus, characteristic data derived from the email is logged to a database which is scanned for outbreak-indicating traffic patterns.

INTRODUCTION

[0001] The present invention relates to a method of, and system for, processing email, in particular to detect virus outbreaks. The invention is particularly, but not exclusively, applicable to processing of email by ISPs (Internet Service Providers).

BACKGROUND OF THE INVENTION

[0002] It should be noted that some discussions of malicious software use the term “virus” in a narrow sense as relating to software having particular characteristics in terms of propagation, possibly also multiplication, and effect which are distinct from other forms such as “trojan horses”, “worms”, etc. However, in this specification, including the appended claims, the term virus is used in the general sense of any software which by malice (or accident) causes undesired effects.

[0003] Conventional virus checkers find viruses by looking for known patterns in files, by checking for new or changed files in a file system or by running suspicious programs in a sandbox emulator environment to detect virus-like activity.

[0004] The increasing use of email, over both the Internet and private networks, increases the exposure of individual end users and operations to malicious disruption. Recently there have been email-borne virus outbreaks which have spread across the world in a matter of hours. Some degree of protection can be achieved by scanning emails and their attachments for viruses and obviously this is best done on a centralised basis, e.g. by ISPs and others who operate email gateways, rather than leaving it to end users who may or may not have the resources, knowledge or inclination to take their own anti-virus measures.

[0005] However, even with centralised scanning there is still a problem with new viruses. Leaving aside the question of how a new virus is first detected, whether by measures taken by an ISP or similar, or at an end user's machine, the steps necessary to mitigate the effect of an outbreak of it take time to put into effect, and by the time that they have been, the worst effects of the outbreak may already have occurred, all across the world. These steps typically include identifying a characteristic string of bytes or other “signature” which identifies the virus, disseminating this information to virus-scanning sites, and programming the scanners with this information, all of which takes time, and meanwhile the outbreak is free to spread.

[0006] This has become particularly problematic recently with the type of virus which can effectively multiply itself by generating and sending copies of the email which contains it, e.g. by accessing an email address book (e.g. that available to an end user's email client) and then using services available on the machine to send a copy of the email and itself to any or all of the addresses found. This tactic can propagate between continents in a matter of minutes and result in a geometric “explosion” of the number of instances of it.

OBJECT OF THE INVENTION

[0007] The present invention seeks to reduce the problem of dealing with new viruses borne by email.

SUMMARY OF THE INVENTION

[0008] The invention provides a method of processing email to detect the spread of previously unknown viruses which comprises monitoring email traffic passing through one or more nodes of a network for patterns of email traffic which are indicative of, or suggestive of, the spread of an email-borne virus and, once such a pattern is detected, initiating automatic remedial action, alerting an operator, or both.

[0009] The invention also provides a system for processing email to detect the spread of previously unknown viruses which comprises monitoring email traffic passing through one or more nodes of a network for patterns of email traffic which are indicative of, or suggestive of, the spread of an email-borne virus and, once such a pattern is detected, initiating automatic remedial action, alerting an operator, or both.

[0010] Thus, rather than monitoring individual emails, the invention treats emails being processed as an “ensemble” and looks for patterns in the traffic of email which are characteristic of viruses being propagated via email. It has been found that such characteristic patterns are relatively easy to define, and to identify once they occur.

[0011] To assist in the identification of relevant patterns of email traffic, each email is analysed by reference to a number of criteria which indicate that the email may contain a virus. Any email which meets any of these criteria may then be logged to a database. Examination of recent additions to this database can then be used to identify traffic patterns indicative or suggestive of a virus outbreak.

[0012] The decision whether or not to log a particular email can be taken on the basis of whether it meets one or more criteria indicating that it is possible for the email to contain a virus. In other words, the criteria chosen to decide whether to log an email can be ones which indicate that it is possible for the email to contain a virus, regardless of whether it actually does, on the basis that emails which cannot possibly contain a virus need not be individually logged. However, the invention does not exclude the possibility that one or more criteria seek to determine whether an email actually does contain a virus, by any suitable scanning, or other analytical, technique.

[0013] Suppose a user reports that a particular email contained a virus as an attachment, and that this is one of a number of emails that have been recently processed by the system. The database will have in it entries recording items such as the sender and recipient, email subject, attachment names and sizes. It is possible, automatically (i.e. in software) or with human intervention, to identify the relevant stored attributes of these messages and use them as the basis for taking corrective action in relation to subsequently processed, matching, emails. It is also possible to notify recipients of matching emails which have already been processed to take corrective action of their own, e.g. to delete the email unread and unopened, assuming the system stores the recipient name in plaintext.

DESCRIPTION OF THE DRAWINGS

[0014] The invention will be further described by way of non-limitative example with reference to the accompanying drawings, in which:

[0015] FIG. 1 illustrates the process of sending an email over the Internet; and

[0016] FIG. 2 is a block diagram of one embodiment of the invention.

ILLUSTRATED EMBODIMENT

[0017] Before describing the illustrated embodiment of the invention, a typical process of sending an email over the Internet will briefly be described with reference to FIG. 1. This is purely for illustration; there are several methods for delivering and receiving email on the Internet, including, but not limited to: end-to-end SMTP, IMAP4 and UUCP. There are also other ways of achieving SMTP to POP3 email, including for instance, using an ISDN or leased line connection instead of a dial-up modem connection.

[0018] Suppose a user 1A with an email ID “asender”, who has his account at “asource.com”, wishes to send an email to someone 1B with an account “arecipient” at “adestination.com”, and that these .com domains are maintained by respective ISPs (Internet Service Providers). Each of the domains has a mail server 2A,2B which includes one or more SMTP servers 3A,3B for outbound messages and one or more POP3 servers 4A,4B for inbound ones. These domains form part of the Internet which for clarity is indicated separately at 5. The process proceeds as follows:

[0019] 1. Asender prepares the email message using email client software 1A such as Microsoft Outlook Express and addresses it to “arecipient@adestination.com”.

[0020] 2. Using a dial-up modem connection or similar, asender's email client 1A connects to the email server 2A at “mail.asource.com”.

[0021] 3. Asender's email client 1A conducts a conversation with the SMTP server 3A, in the course of which it tells the SMTP server 3A the addresses of the sender and recipient and sends it the body of the message (including any attachments) thus transferring the email 10 to the server 3A.

[0022] 4. The SMTP server 3A parses the TO field of the email envelope into a) the recipient and b) the recipient's domain name. It is assumed for the present purposes that the sender's and recipient's ISPs are different, otherwise the SMTP server 3A could simply route the email through to its associated POP3 server(s) 4A for subsequent collection.

[0023] 5. The SMTP server 3A locates an Internet Domain Name server and obtains an IP address for the destination domain's mail server.

[0024] 6. The SMTP server 3A connects to the SMTP server 3B at “adestination.com” via SMTP and sends it the sender and recipient addresses and message body similarly to Step 3.

[0025] 7. The SMTP server 3B recognises that the domain name refers to itself, and passes the message to “adestination.com”'s POP3 server 4B, which puts the message in “arecipient”'s mailbox for collection by the recipient's email client 1B.
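
The routing decision made in steps 4 and 7 can be sketched as follows. This is a minimal illustration only, assuming a plain `local@domain` address; the function names `split_recipient` and `route` are hypothetical and are not part of any mail server described here.

```python
def split_recipient(address: str) -> tuple[str, str]:
    """Split an envelope recipient into (recipient, domain name), as in step 4."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a usable recipient address: {address!r}")
    # Domain names compare case-insensitively, so normalise the domain.
    return local, domain.lower()


def route(address: str, own_domain: str) -> str:
    """Return 'local' if this server serves the recipient's domain, else 'relay'."""
    _, domain = split_recipient(address)
    return "local" if domain == own_domain.lower() else "relay"
```

Under these assumptions, the server at asource.com would relay a message for arecipient@adestination.com, while the server at adestination.com would deliver it locally.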

[0026] There are various ways in which email can be used to malicious effect, probably the most widely known being a virus which travels with the email as an attachment. Typically, the recipient “opening” the attachment, as by double-clicking it, allows the virus which may be a binary executable or scripting code written to an interpreter hosted by the email client or the operating system, to execute. Neither the problem of malicious intent, nor the present invention's solution to it, is restricted to viruses of this type. For example other malicious attacks may involve exploiting weaknesses of the delivery system (SMTP+POP3) or the email client, as by deliberately formatting an email header field in a way which is known to cause misoperation of software which processes it.

[0027] Referring now to FIG. 2, this shows in block form the key sub-systems of an embodiment of the present invention. In the example under consideration, i.e. the processing of email by an ISP, these subsystems are implemented by software executing on the ISP's computer(s). These computers operate one or more email gateways 20A ... 20N passing email messages such as 10.

[0028] The various subsystems of the embodiment will be described in more detail below but briefly comprise:

[0029] a message decomposer/analyser 21 which decomposes emails into their constituent parts and analyses them to assess whether they are candidates for logging;

[0030] a logger 22 which prepares a database entry for each message selected as a logging candidate by the decomposer/analyser 21;

[0031] a database 23 which stores the entries prepared by logger 22;

[0032] a searcher 24 which scans new entries in the database 23 looking for signs of virus-bearing traffic;

[0033] a stopper 25 which signals the results from the searcher 24 and optionally stops the passage of emails which conform to criteria of the decomposer/analyser 21 as indicating a virus threat.

[0034] The stopper 25 can be implemented in such a way that emails which are processed by the system and are not considered to be infected with a virus can have a text notification inserted in them, e.g. appended to the message text, saying that the email has been scanned by the system, so that the recipient will be able to see that it has.

[0035] Overall, the system of FIG. 2 works on the following principles.

[0036] Viruses that spread by email can be detected by examining the traffic patterns of the emails they create.

[0037] The illustrated embodiment applies a set of heuristics to identify email viruses. The following is a non-exhaustive list of criteria by which emails may be assessed to implement these heuristics. Other criteria may be used as well or instead:

[0038] They contain the same or similar subject lines;

[0039] They contain the same or similar body texts;

[0040] They contain the same named attachment;

[0041] They contain an attachment with the same message digest;

[0042] They are addressed to many recipients;

[0043] They are addressed to recipients in alphabetical, or reverse alphabetical order;

[0044] They are sent to a particular email address, and then exit multiply from the same email address, and/or similar email addresses;

[0045] They contain the same structural format;

[0046] They contain the same structural quirks;

[0047] They contain the same unusual message headers.

[0048] The above criteria should be self-explanatory, except possibly those which refer to “message digest” and “structural quirks”; those expressions are explained below.

[0049] Each of the above criteria is assigned a numerical score. Each email that passes through the system is analysed by the decomposer/analyser 21, and logged in a database 23 by logger 22. A search routine executed by searcher 24 continually analyses the new information being stored in the database to see if similar messages are being sent. If they are, then the ‘suspiciousness’ of the email is calculated using an algorithm which takes into account how similar the messages are, and also how many of them have been received recently. Once a threshold has been passed, all new messages that match the criteria are stopped as potential viruses by stopper 25, and an alarm is raised.

[0050] The system may generate a message digest, at least for those messages which are logged in the database. Message digests are a convenient and efficient means of identifying messages with the same message text and as a “handle” by which to retrieve a collection of log entries which represent the same message text being sent in multiple emails. The digest may be stored in the database in addition to, or instead of, the message list.

[0051] A message digest is typically created by applying a one way hashing algorithm (such as MD5 or Message-Digest-5) to a series of characters (in the present case, for example, the characters of a message). The advantages of using a digest in this application are:

[0052] They are typically smaller than the original message, and are of fixed length, so they can be stored in a database more easily;

[0053] They are typically one-way functions, so the original message cannot be recreated, thus preserving customer confidentiality;

[0054] A small change in the message will result in a completely different digest.

[0055] For instance, the MD5 digest of “The rain in spain falls mainly on the plain” is 6f7f4c35a219625efc5a9ebad8fa8527 and of “The rain in Spain falls mainly on the plain” is b417b67704f2dd2b5a812f99ade30e00. These two messages differ only by one bit (the ‘s’ in Spain, since a capital S is one bit different to a lowercase s in the ASCII character set), but the digests are totally different.
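
The property described above can be checked with a short sketch using Python's standard `hashlib` module. The helper names `message_digest` and `differing_bits` are illustrative assumptions, and the sketch assumes the messages are encoded as ASCII; it does not assert any particular digest value.

```python
import hashlib


def message_digest(text: str) -> str:
    """Return the MD5 digest of a message as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("ascii")).hexdigest()


def differing_bits(a: str, b: str) -> int:
    """Count the bits that differ between two equal-length ASCII strings."""
    pairs = zip(a.encode("ascii"), b.encode("ascii"))
    return sum(bin(x ^ y).count("1") for x, y in pairs)


lower = "The rain in spain falls mainly on the plain"
upper = "The rain in Spain falls mainly on the plain"

# The inputs differ by a single bit ('s' is 0x73, 'S' is 0x53) ...
print(differing_bits(lower, upper))
# ... yet the two fixed-length digests are unrelated.
print(message_digest(lower))
print(message_digest(upper))
```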

[0056] Some examples of the criteria by which emails may be assessed will now be given:

[0057] Structural quirks: Most emails are generated by tried and tested applications. These applications will always generate email in a particular way. It is often possible to identify which application generated a particular email by examining the email headers and also by examining the format of the different parts. It is then possible to identify emails which contain quirks which either indicate that the email is attempting to look as if it was generated by a known emailer, but was not, or that it was generated by a new and unknown mailer, or by an application (which could be a virus or worm). All are suspicious.

EXAMPLES

[0058] Inconsistent Capitalisation

[0059] from: alex@star.co.uk

[0060] To: alex@star.co.uk

[0061] The “from” and “To” header names have different capitalisation.

[0062] Non-Standard Ordering of Header Elements

[0063] Subject: Tower fault tolerance

[0064] Content-type: multipart/mixed; boundary=“======_(—)962609498==_”

[0065] Mime-Version: 1.0

[0066] The Mime-Version header normally comes before the Content-Type header.

[0067] Missing or Additional Header Elements

[0068] X-Mailer: QUALCOMM Windows Eudora Pro Version 3.0.5 (32)

[0069] Date: Mon, 3 Jul. 2000 12:24:17 +0100

[0070] Eudora normally also includes an X-Sender header

[0071] Message ID Format

[0072] Message-ID:<00270ibfe4elSb37dbdc0S9264010a@tomkins.int.star.co.uk>

[0073] X-Mailer: QUALCOMM Windows Eudora Pro Version 3.0.5 (32)

[0074] The X-mailer header says the mail is generated by Eudora, but the message-id format is an Outlook message-id, not a Eudora message-id.

[0075] Boundary Format

[0076] X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0

[0077] Content-Type: multipart/mixed; boundary=“======_(—)962609498==_”

[0078] The X-mailer header says the mail is generated by Outlook, but the boundary format is a Eudora boundary, not an Outlook boundary.

[0079] Line Break and Other White Space Composition in Message Header

[0080] To: “Andrew Webley” <awebley@messagelabs.com>,

[0081] “Matt Cave” <MCave@messagelabs.com>,

[0082] “Alex at MessageLabs” ashipp@messagelabs.com

[0083] X-Mailer: QUALCOMM Windows Eudora Pro Version 3.0.5 (32)

[0084] The e-mailer (Eudora) normally uses a single space, and no tabs for continuation lines.
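
Two of the structural quirks illustrated above, inconsistent capitalisation of header names and non-standard header ordering, lend themselves to a simple check. The following is a hedged sketch only: the function name `find_quirks` and the quirk labels are assumptions made for illustration, and a practical analyser would need per-mailer profiles to cover the other examples.

```python
def find_quirks(raw_headers: str) -> list[str]:
    """Return labels for structural quirks found in a block of header lines."""
    # Collect header names, skipping continuation lines (which start with
    # white space) and anything without a colon.
    names = [
        line.split(":", 1)[0]
        for line in raw_headers.splitlines()
        if line and not line[0].isspace() and ":" in line
    ]
    lowered = [name.lower() for name in names]
    quirks = []

    # Mime-Version normally comes before Content-Type.
    if "mime-version" in lowered and "content-type" in lowered:
        if lowered.index("mime-version") > lowered.index("content-type"):
            quirks.append("mime-version-after-content-type")

    # The From and To header names should be capitalised consistently.
    address_names = [name for name in names if name.lower() in ("from", "to")]
    if len({name[0].isupper() for name in address_names}) > 1:
        quirks.append("inconsistent-capitalisation")

    return quirks
```

The returned labels could serve as the structural quirk indicators that are logged for each message.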

[0085] It Originates from Particular IP Addresses or IP Address Ranges.

[0086] The IP address of the originator is, of course, known and hence can be used to determine whether this criterion is met.

[0087] It Contains Specialised Constructs

[0088] Some email uses HTML script to encrypt the message content. This is intended to defeat linguistic analysers. When the mail is viewed in a mail client such as Outlook, the text is immediately decrypted and displayed. It would be unusual for a normal email to do this.

[0089] Empty Message Sender Envelopes

[0090] An email normally indicates the originator in the Sender text field and spam originators will often put a bogus entry in that field to disguise the fact that the email is infected. However, the Sender identity is also supposed to be specified in the protocol under which SMTP processes talk to one another in the transfer of email, and this criterion is concerned with the absence of the sender identification from the relevant protocol slot, namely the Mail From protocol slot.

[0091] Invalid Message Sender email Addresses

[0092] This is complementary to item 8 and involves consideration of both the sender field of the message and the sender protocol slot, as to whether it is invalid. The email may come from a domain which does not exist or does not follow the normal rules for the domain. For instance, a HotMail address of “123@hotmail.com” is invalid because HotMail addresses cannot be all numbers.

[0093] A number of fields of the email may be examined for invalid entries, including “Sender”, “From”, and “Errors-to”.

[0094] Message Sender Addresses which do not Match the Mail Server from which the Mail is Sent.

[0095] The local mail server knows, or at least can find out from the protocol, the address of the mail sender, and so a determination can be made of whether this matches the sender address in the mail text.
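
As a rough sketch of the last check, the sender given in the protocol (the Mail From slot) can be compared with the address in the From header of the message text. The function name `sender_mismatch` is a hypothetical helper; `parseaddr` is the standard-library routine from `email.utils`. The whole address is compared case-insensitively here, which is a simplifying assumption.

```python
from email.utils import parseaddr


def sender_mismatch(envelope_sender: str, from_header: str) -> bool:
    """Return True if the protocol-level sender differs from the From header."""
    # parseaddr splits '"Name" <user@host>' into (name, address).
    _, header_address = parseaddr(from_header)
    return envelope_sender.strip().lower() != header_address.strip().lower()
```

With this sketch an empty Mail From slot also counts as a mismatch whenever the From header carries an address, so the empty-envelope criterion is caught by the same comparison.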

[0096] In an actual implementation of the system of FIG. 2 a network of email gateways 20 is preferred, so that email can be processed on the required scale. The more widespread this network, and the more email processed, then the greater the chances of being able to intercept new viruses, recognise the symptoms, and stop further occurrences before the virus becomes too widespread. However, use of a number of email gateways is not an essential component of the system; the system is able to recognise and detect new viruses even if only one email gateway is used, and if even a small amount of email passes through it.

[0097] All email is passed through the analyser/decomposer 21 in which email is broken into its constituent parts. For the purposes of traffic heuristics, each part is classified as:

[0098] The email header/mime headers;

[0099] A component normally considered part of the message;

[0100] A component normally considered as an attachment.

[0101] Each part is then further analysed to see if it has the possibility of containing potential threats.

[0102] Email header/mime headers: Overlong lines, or lines with unusual syntax may be used to crash particular browsers, causing either a denial of service attack or an exploit which can cause a security breach or spread a virus.

[0103] A component normally considered part of the message: These may contain embedded executable code. For instance, an HTML message may contain scripting code in various computer languages, or it may contain elements (such as <frameset> or <object> tags) which have been shown to be exploitable.

[0104] A component normally considered as an attachment: These may be directly executable, such as an EXE file. They may contain embedded executable code, such as a Microsoft Word document containing a macro. They may contain archive file or other container files, which themselves may contain other dangerous components. For instance, a ZIP file may contain an executable.

[0105] Normally, the attachment must contain some executable element to be viewed as a potential threat. However, the system is capable of being toggled into a mode where it views all attachments as a potential threat. This is to cater for two possibilities such as:

[0106] A document, such as a jpg picture, may contain illegal formatting that crashes the application used to view the attachment. This can cause either a denial of service attack, or an exploit which can cause a security breach or spread a virus.

[0107] The message body may contain instructions which, if followed, turn the attachment into a dangerous form, e.g. ‘rename picture.jpg to picture.exe’.

[0108] After analysing each component, then if any one component has the possibility of containing a potential threat, the message is logged by the logger 22 in the database 23. Otherwise, the message is not logged.
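
The logging decision described above might be sketched as below. Everything specific in this sketch is an assumption made for illustration: the extension list, the HTML markers, the 998-character header line limit (taken from RFC 5322) and the function names. Judging attachments by file name alone is a simplification; a real analyser would inspect the content.

```python
import os

# Illustrative, non-exhaustive assumptions: directly executable files,
# documents that may embed macros, and containers that may hold executables.
POSSIBLY_EXECUTABLE = {".exe", ".doc", ".zip"}
RISKY_HTML_MARKERS = ("<script", "<frameset", "<object")
MAX_HEADER_LINE = 998  # assumed limit; RFC 5322 caps lines at 998 characters


def header_is_suspect(line: str) -> bool:
    """Overlong header lines are treated as a potential threat."""
    return len(line) > MAX_HEADER_LINE


def body_is_suspect(text: str) -> bool:
    """Message parts containing script or exploitable tags are suspect."""
    lowered = text.lower()
    return any(marker in lowered for marker in RISKY_HTML_MARKERS)


def attachment_is_suspect(filename: str, all_attachments_suspect: bool = False) -> bool:
    """Attachments are suspect if possibly executable, or always when toggled."""
    if all_attachments_suspect:
        return True
    extension = os.path.splitext(filename.lower())[1]
    return extension in POSSIBLY_EXECUTABLE


def should_log(headers, bodies, attachments, all_attachments_suspect=False) -> bool:
    """Log the message if any one component could contain a threat."""
    return (
        any(header_is_suspect(h) for h in headers)
        or any(body_is_suspect(b) for b in bodies)
        or any(attachment_is_suspect(a, all_attachments_suspect) for a in attachments)
    )
```

The `all_attachments_suspect` flag corresponds to the mode in which every attachment is viewed as a potential threat.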

[0109] The logger 22 is programmed so that the system logs components of each message so that similar messages can be detected. The following are logged:

[0110] Subject line and digest of subject line;

[0111] First few characters of text part of email, digest of first text part, and digest of first few characters;

[0112] Name of first attachment;

[0113] Digest of first attachment;

[0114] Number of recipients;

[0115] Whether recipients are in alphabetical order, or reverse alphabetical order;

[0116] Time of logging;

[0117] Digest of sender;

[0118] Digest of first recipient;

[0119] Structural format indicators;

[0120] Structural quirk indicators;

[0121] Unusual message headers;

[0122] Time email arrived.

[0123] The above list is not exhaustive, and the invention is not restricted to this particular combination of information items.
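
A log entry holding a subset of the items listed above could be represented as in the following sketch. The record layout, the field names and the helpers `digest`, `ordering_of` and `build_entry` are assumptions made for illustration; MD5 is used only because it is the digest named earlier in this description.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import Optional


def digest(text: str) -> str:
    """MD5 digest of a text item, as a hexadecimal string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def ordering_of(recipients: list[str]) -> str:
    """Report whether recipients are in alphabetical or reverse order."""
    keys = [r.lower() for r in recipients]
    if len(keys) < 2:
        return "none"
    if keys == sorted(keys):
        return "alphabetical"
    if keys == sorted(keys, reverse=True):
        return "reverse"
    return "none"


@dataclass(frozen=True)
class LogEntry:
    subject: str
    subject_digest: str
    first_attachment_name: Optional[str]
    recipient_count: int
    recipient_order: str
    sender_digest: str
    first_recipient_digest: str
    logged_at: float


def build_entry(subject, sender, recipients, attachment_names, now=None) -> LogEntry:
    """Prepare a database entry; addresses are stored only as digests."""
    return LogEntry(
        subject=subject,
        subject_digest=digest(subject),
        first_attachment_name=attachment_names[0] if attachment_names else None,
        recipient_count=len(recipients),
        recipient_order=ordering_of(recipients),
        sender_digest=digest(sender.lower()),
        first_recipient_digest=digest(recipients[0].lower()) if recipients else "",
        logged_at=time.time() if now is None else now,
    )
```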

[0124] The database 23 logs details about messages, and allows querying of the details to find patterns of duplicate or similar emails.

[0125] In order to provide responsiveness, logging may be a one tier or several tier operation. For instance, messages may be logged locally in a database geographically near to the email servers, and analysed locally. This gives a quick response to local traffic patterns. However, the logs may also be copied back to a central database to perform global analysis. This will be slower to react, but will be able to react on global, rather than local patterns.

[0126] Old log entries are automatically deleted from the database 23 since they are no longer needed; the system is designed to provide an early warning of new viruses.

[0127] The searcher 24 periodically queries the database searching for recent similar messages and generating a score by analysing the components. Depending on the score, the system may identify a ‘definite’ threat or a ‘potential’ threat. A definite threat causes a signature to be sent back to the stopper so that all future messages with that characteristic are stopped. A potential threat causes an alert to be sent to an operator who can then decide to treat it as if it were a definite threat, to flag it as a false alarm so no future occurrences are reported, or to wait and see.

[0128] The searcher can be configured with different parameters, so that it can be more sensitive if searching logs from a single email gateway, and less sensitive if processing a database of world-wide information.

[0129] Each criterion can be associated with a different score.

[0130] The time between searches can be adjusted.

[0131] The time span each search covers can be adjusted and multiple time spans accommodated.

[0132] Overall thresholds can be set.

[0133] The stopper 25 takes signatures from the searcher 24. The signature identifies characteristics of emails which must be stopped. On receiving the signature, all future matching emails are treated as viruses, and stopped.
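
One way to picture the stopper is as a list of signatures, each a set of attribute values that an email must match exactly. This is a sketch under that assumption; the class name `Stopper`, its methods and the attribute keys are hypothetical, and the description above does not fix a signature format.

```python
def matches(signature: dict, email_attributes: dict) -> bool:
    """An email matches when every attribute in the signature is equal."""
    return all(email_attributes.get(key) == value for key, value in signature.items())


class Stopper:
    """Holds signatures received from the searcher and tests emails against them."""

    def __init__(self) -> None:
        self.signatures: list[dict] = []

    def add_signature(self, signature: dict) -> None:
        # An empty signature would match every email, so reject it.
        if not signature:
            raise ValueError("a signature must name at least one attribute")
        self.signatures.append(dict(signature))

    def should_stop(self, email_attributes: dict) -> bool:
        return any(matches(s, email_attributes) for s in self.signatures)
```

A signature such as `{"attachment_digest": "..."}` (an assumed key) would then stop every later email whose logged attachment digest is the same, regardless of subject or sender.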

[0134] Obviously, the stopping action can take a number of forms, including:

[0135] Disposing of the infected emails without sending them to their addressed recipients.

[0136] Holding them in temporary storage and notifying the addressee by email that an infected message has been intercepted and is being held for a period for their retrieval, should they wish, otherwise it will be deleted.

[0137] Disinfecting the email by removing the virus threat by any suitable means; for example if the virus is an executable attachment, it can be detached or disarmed before forwarding the email to its addressees. The email may be modified by the inclusion of a text message saying that the email has been disinfected.

[0138] Where a virus is detected, an automated mail server 30 may notify other sites of the relevant characteristics of the infected emails, either to alert human operators or to supply embodiments of the invention at remote sites with the characteristics of the emails necessary for their stoppers 25 to stop them.

[0139] Typical Algorithm

[0140] The following is one possible algorithm which can be implemented by the searcher 24 in an illustrated embodiment of the invention.

[0141] Referring to the example email-assessment criteria set out above, it will be appreciated that an email under consideration has a number of attributes which can be represented as data values in a computer program, with the data type depending on the nature of the attribute. For example, the length of the message and number of attachments are integers, whereas the various text headers (e.g. To, SendTo, Subject) are character strings, as are digests such as the message digest. In the following, emails are considered to be equal according to a given criterion if the corresponding attributes are equal in the cases of integers and character strings. In the case of character strings, where appropriate, equality can be determined by a case-insensitive comparison; case-insensitive comparisons are appropriate for the textual fields of an email, but not necessarily for other character strings. (In the case of an attribute represented by a floating point value, the skilled man will be aware that comparisons should be done on the basis of whether the absolute value of the difference is greater than some small arbitrary value, sometimes referred to as “epsilon” in the technical literature, which is itself greater than the rounding error).
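
The comparison rules just described can be sketched as a single helper. The function name `attributes_equal` and the default value of `epsilon` are assumptions chosen for illustration; a suitable epsilon depends on the attribute being compared.

```python
EPSILON = 1e-9  # assumed tolerance; must exceed the expected rounding error


def attributes_equal(a, b, *, case_insensitive: bool = False, epsilon: float = EPSILON) -> bool:
    """Compare two attribute values according to their data type."""
    if isinstance(a, str) and isinstance(b, str):
        # Textual fields may be compared without regard to case;
        # digests and similar strings should be compared exactly.
        if case_insensitive:
            return a.casefold() == b.casefold()
        return a == b
    both_numeric = isinstance(a, (int, float)) and isinstance(b, (int, float))
    if both_numeric and (isinstance(a, float) or isinstance(b, float)):
        # Floating point values are equal if they differ by no more than epsilon.
        return abs(a - b) <= epsilon
    # Integers (and anything else) are compared directly.
    return a == b
```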

[0142] Below, the numbers in brackets are step numbers to identify the steps carried out.

[0143] At regular intervals (100):

[0144] For each criterion A we are measuring (110)

[0145] For each time interval B minutes we are measuring (200)

[0146] Get sample set S of emails over last B minutes where their value according to a selected criterion A is equal (210). Partition the sample set if it contains values which cannot be the same virus (for instance, if some emails in the set contain an HTML script, and some contain an EXE, these cannot be the same virus, and should each be treated as a separate set S per step 210).

[0147] For each sample set S (300)

[0148] Set X=count of mails in sample set (310)

[0149] Multiply X from step 310 by an importance factor C for criterion A (320). Each criterion has a respective importance factor which depends on the nature of the criterion, since some criteria, e.g. the name of a file attachment, may be more significant than others so far as assessing the likelihood of a virus threat is concerned; similar comments apply to the other factors mentioned below.

[0150] Add to X from step 320 a second importance factor D for each other criterion A2, where A2 is also equal over the sample set S (330)

[0151] Add to X from step 330 a third importance factor E for each other criterion A3, where A3 has a limited set of different values over the sample set S (340). A “limited set” means more than 1 and fewer than R different values. Each time interval B has a respective R.

[0152] Add to X from step 340 a spread factor (P times T) if the sample set contains emails entering a domain, and then T copies leaving the domain (where T>Q) (350). Each time interval B has a different P and Q.

[0153] If X from step 350 is greater than threshold V (each time interval B has a respective threshold V) then flag as virus. (360)

[0154] Else

[0155] If X from step 350 is greater than threshold O (each time interval B has a respective threshold O), where O is less than V, then flag as needing operator assistance (370). The operator can then assess whether a virus threat is present or not and instruct the software to proceed accordingly.

[0156] Next sample set (380)

[0157] Next interval (210)

[0158] Next criterion (120)

[0159] Note that the three “importance” factors C, D, E, the spread factor and thresholds are numeric values which may be set empirically and may be adjusted dynamically. Also, the algorithm may be carried out using one or more different values for the time interval B, e.g. 5 minutes, 30 minutes and 180 minutes.
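
The scoring steps above can be sketched in code as follows. This is an illustrative sketch, not a definitive implementation: it assumes each logged email is a dictionary of attribute values with a `"time"` entry in minutes, it uses hypothetical names (`IntervalParams`, `score_sample`, `search`), and it omits the partitioning of step 210 and the spread factor of step 350 for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class IntervalParams:
    minutes: float  # time interval B
    D: float        # second importance factor
    E: float        # third importance factor
    R: int          # upper bound of a "limited set" of values
    V: float        # threshold for flagging as virus
    O: float        # threshold for operator assistance (O < V)


def score_sample(sample, criterion, importance, p):
    """Steps 310 to 340 for one sample set S (spread factor omitted)."""
    x = len(sample)                    # step 310
    x *= importance[criterion]         # step 320, importance factor C
    for other in importance:
        if other == criterion:
            continue
        distinct = {email.get(other) for email in sample}
        if len(distinct) == 1:
            x += p.D                   # step 330, A2 also equal over S
        elif 1 < len(distinct) < p.R:
            x += p.E                   # step 340, A3 has a limited set of values
    return x


def search(emails, now, importance, intervals):
    """Return (criterion, value, minutes, flag) for each suspicious sample set."""
    findings = []
    for criterion in importance:                        # step 110
        for p in intervals:                             # step 200
            groups = defaultdict(list)
            for email in emails:                        # step 210
                recent = now - email["time"] <= p.minutes
                if recent and criterion in email:
                    groups[email[criterion]].append(email)
            for value, sample in groups.items():        # step 300
                x = score_sample(sample, criterion, importance, p)
                if x > p.V:                             # step 360
                    findings.append((criterion, value, p.minutes, "virus"))
                elif x > p.O:                           # step 370
                    findings.append((criterion, value, p.minutes, "operator"))
    return findings
```

For example, with assumed factors of 3 for the attachment name and 1 for the subject line, six recent emails sharing an attachment name and a subject score 6 × 3 + D under the attachment criterion, which exceeds a threshold V of 15 when D is 2.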

[0160] In English: we are looking for emails with similar characteristics arriving in a given time period. The more similar emails we find, the more suspicious we become. If the emails also have other characteristics in common, this makes us even more suspicious.

[0161] Some things may be more suspicious than others; for instance we may choose to allocate a higher score if we see emails with the same named attachment, than if we see emails with the same subject line.

[0162] If we see emails being sent to one domain, and then come flooding out, this is also suspicious.

[0163] Although, in the above, the invention has been described by reference to its application to Internet email, it is not restricted to such email; the invention is equally applicable to other private or public, local- or wide-area network or combinations of such networks with one another and with the Internet, as well as to email over WAP (Wireless Application Protocol) and SMS (Short Message Service) for mobile telephones and similar devices.

1. A method of processing email to detect the spread of previouslyunknown viruses which comprises monitoring email traffic passing throughone or more nodes of a network for patterns of email traffic which areindicative of, or suggestive of, the spread of an email-borne virus and,once such a pattern is detected, initiating automatic remedial action,alerting an operator, or both.
 2. A method according to claim 1 whichcomprises decomposing each email into its constituent parts, analysingone or more of the decomposed constituent parts for content taken to beindicative of a potential virus and logging data of the decomposed emailto a database.
 3. A method according to claim 2, wherein data is loggedonly in respect of email which, on analysis, meets at least onecriterion indicating that it is possible for the email to contain avirus.
 4. A method according to claim 3, wherein data is logged inrespect of email which, on analysis, meets any of a number of criteriaindicating that it is possible for the email to contain a virus.
 5. A method according to claim 2, 3 or 4 and including the step of continually or continuously executing an algorithm against entries in a database to identify patterns of email traffic taken to be indicative of a virus outbreak.
 6. A method according to claim 5, wherein the database algorithm examines, principally or exclusively, only “recently” added database entries, i.e. entries which have been added less than a predetermined time ago.
 7. A method according to any one of the preceding claims wherein the corrective action includes any or all of the following, in relation to each email which conforms to the detected pattern: a) at least temporarily stopping the passage of the emails; b) notifying the sender; c) notifying the intended recipient(s); d) disinfecting the email; e) generating a signal to alert a human operator.
 8. A method according to any one of claims 1 to 7 and including the step of forwarding emails which are taken to be infected, to their addressees.
 9. A method according to any one of claims 1 to 8 and including sending a message identifying suspect emails to an automated email server.
 10. A method according to any one of claims 1 to 9 and including the step of processing infected emails to disinfect them or to disarm a virus therein.
 11. A method according to any one of claims 1 to 10 and including the step of inserting in emails not taken to be virus infected, a message indicating that the email has been processed.
 12. A system for processing email to detect the spread of previously unknown viruses which comprises monitoring email traffic passing through one or more nodes of a network for patterns of email traffic which are indicative of, or suggestive of, the spread of an email-borne virus and, once such a pattern is detected, initiating automatic remedial action, alerting an operator, or both.
 13. A system according to claim 12 which comprises decomposing each email into its constituent parts, analysing one or more of the decomposed constituent parts for content taken to be indicative of a potential virus and logging data of the decomposed email to a database.
 14. A system according to claim 12, wherein data is logged only in respect of email which, on analysis, meets at least one criterion indicating that it is possible for the email to contain a virus.
 15. A system according to claim 14, wherein data is logged in respect of email which, on analysis, meets any of a number of criteria indicating that it is possible for the email to contain a virus.
 16. A system according to claim 12, 13 or 14 and including the step of continually or continuously executing an algorithm against entries in the database to identify patterns of email traffic taken to be indicative of a virus outbreak.
 17. A system according to claim 16, wherein the database algorithm examines, principally or exclusively, only “recently” added database entries, i.e. entries which have been added less than a predetermined time ago.
 18. A system according to any one of claims 12 to 17 wherein the corrective action includes any or all of the following, in relation to each email which conforms to the detected pattern: a) at least temporarily stopping the passage of the emails; b) notifying the sender; c) notifying the intended recipient(s); d) disinfecting the email; e) generating a signal to alert a human operator.
 19. A system according to any one of claims 12 to 18 and including means for forwarding emails which are taken to be infected, to their addressees.
 20. A system according to any one of claims 12 to 19 and including sending a message identifying suspect emails to an automated email server.
 21. A system according to any one of claims 12 to 20 and including means for processing infected emails to disinfect them or to disarm a virus therein.
 22. A system according to any one of claims 12 to 21 and including means for inserting in emails not taken to be virus infected, a message indicating that the email has been processed.